NPS ARCHIVE 
1969 
BOND, W. 



INVESTIGATION OF DISTORTION OF DIVERS' 
SPEECH USING POWER SPECTRAL ESTIMATES 
BASED ON THE FAST FOURIER TRANSFORM 

by 



William Howard Bond 






DUDLEY KNOX LIBRARY 
NAVAL POSTGRADUATE SCHOOL 
MONTEREY, CA 93943-5101 



United States 

Naval Postgraduate School 




THESIS 



INVESTIGATION OF DISTORTION OF DIVERS'
SPEECH USING POWER SPECTRAL ESTIMATES 
BASED ON THE FAST FOURIER TRANSFORM 

by 

William Howard Bond 
and 

James Michael Myatt 



June 1969 



This document has been approved for public 
release and sale; its distribution is unlimited. 






Thesis by: William H. Bond and James M. Myatt entitled "Investigation
of Distortion of Divers' Speech Using Power Spectral Estimates Based on
the Fast Fourier Transform." June 1969.

ERRATA

Page   Line             Change          To

58     8                Fig. 23 a-c     Fig. 23

61     6                Fig. 20 (a-c)   Fig. 22 a-c

61     2nd paragraph    insert as first line - "The transient spectrum
                        with the oral cavity is shown in Fig. 24."

61     3rd paragraph    insert as first line - "Fig. 25 shows the transient
                        spectrum at 80 ft. for the first 2048 points."



Investigation of Distortion of Divers' 
Speech Using Power Spectral Estimates 
Based on the Fast Fourier Transform 

by 

William Howard Bond
Major, United States Marine Corps
B.S., University of Texas, 1957 

and 

James Michael Myatt 
Captain, United States Marine Corps 
B.S., Sam Houston State College, 1963 



Submitted in partial fulfillment of the 
requirements for the degree of 

MASTER OF SCIENCE IN ELECTRICAL ENGINEERING 
from the 

NAVAL POSTGRADUATE SCHOOL 
June 1969 







ABSTRACT 



The problem of distortion in underwater communications peculiar
to free divers and techniques for analysis of speech waveforms are
discussed. The Fast Fourier Transform algorithm, selected to analyze
shifts in formant frequencies due to restricted oral cavities, high
ambient pressures, and forced speech, is discussed. The Fast Fourier
Transform is used to analyze a vowel sound and show that the expected
shifts do occur. Recommendations are made for extending the techniques
to all non-noise-like sounds and breathing mixtures other than compressed
air.

I. INTRODUCTION






Man's future needs dictate the development of the vast areas of
the continental shelf for food, raw materials, and living space. To
explore, map, and colonize this unknown, a myriad of new equipment must
be developed for explorers of this "last frontier." Today man has the
capability to communicate reliably over the vast reaches of space; but
free divers (divers with no attachment to a habitat or surface station,
i.e., untethered) operating in close proximity cannot communicate
reliably with each other. One UDT officer summed up the state of the art
by saying, "reliable communications between free divers consists of tugs
on buddy lines." Unsatisfactory undersea communications result from
poorly designed oral cavities, high ambient pressures, poor electronics,
and exotic gas breathing mixtures. Current equipment has been further
complicated by designers unfamiliar with the environment and hazards
associated with the free diving world; few designers have taken into
consideration the user's emotional response to the apparatus in the
underwater environment.

In 1959-60, the problem of inadequate communications between free
divers was identified in a specific operational requirement of Navy UDT
and Marine Corps Force Reconnaissance divers. Since that time, the
U. S. Navy and U. S. Marine Corps have been moving on separate paths to
develop a reliable communicator for the free diver. Presently available
equipment is far from an optimum design for many reasons, poor overall
intelligibility being of prime importance. In short, the state of the
art is unacceptable. This unacceptable condition, personal interest,
and the requirement of the Armed Forces for more sophisticated devices
generated the original topic of this research - an investigation into
speech intelligibility of free diver underwater communicators. After
visiting with persons engaged in Navy sponsored work in this realm, it
became apparent that more was involved than just better electronics.
Inherent in improving underwater communications devices is the need
for better understanding of what happens to speech in the undersea
environment. This fact made it necessary to pursue a working knowledge
of the disciplines of linguistics and speech processing. Repeated
conferences with the Communications Sciences Laboratory, University of
Florida, stimulated curiosity about and study of these disciplines.

Past speech processing has taken the form of qualitative analysis
using analog equipment, such as spectrographs, and intelligibility
studies using human listeners; however, few quantitative studies have
been undertaken to isolate the major distortion problems. Many
comparative studies of hard line and free diver communication systems
have been and are being carried out, primarily using human speakers and
listeners to evaluate system intelligibility using controlled word
lists [1]. The spectrograph also gives a measure of quantitative
analysis to highly trained speech analysts.

The use of frequency analysis (a common tool of the electrical
engineer) for simple waveform analysis is a proven technique. In the
past, this technique, transforming the signal under consideration to
the frequency domain using the Fourier transform, required too great an
expenditure of time and human effort for speech waveforms. The advent
of digital computers with vast memory capabilities now makes this
technique a feasible method for speech waveform analysis. Transformation
of the speech waveforms is accomplished using the discrete
Fourier transform. This transformation is reliable if the signal is
band limited and occurs for a finite period of time. The Fast Fourier
Transform (FFT) algorithm provides a rapid method for calculating the
discrete Fourier transform of such signals [2]. This algorithm has
many applications to digital filtering and controlling real time
processes in the frequency domain. Its value is apparent when considering
signals that require 8,000 samples to provide complete coverage of
the frequencies of interest. It provides a saving in computing time of
300 to 1 over previous algorithms; analysis that previously took hours
now takes seconds.

This is the approach taken in this paper - investigation of
distortion in small underwater communicators using the FFT to process
divers' speech. The final value can only be imagined; the analysis can
be applied to isolate distortion in helium speech, to build better helium
speech unscramblers, and to help prevent the loss of vital undersea
explorers, such as Berry Cannon.

A. FREE DIVER COMMUNICATION 

Whenever man goes underwater, he needs communication. Tasks
underwater are made easier and safer if communications exist between
divers and/or surface stations. The military diver uses communications to
carry out complicated tasks, to penetrate enemy held areas, and to effect
rendezvous with underwater vehicles. Reliable underwater communications
do not exist for all cases of interest to the U. S. Navy and U. S. Marine
Corps.

Communications between free divers is a relatively new need; but as
man goes deeper for longer periods of time the requirement will increase
in importance. Sonar has been used for communications between
submersibles and/or surface vessels for many years; small sonar sets
are now performing the same service between divers, submersibles and/or
surface stations. Many problems exist in the diver's case that are not
a consideration in submarines. The more important include the addition
of restricted oral cavities to the vocal tract, increased ambient
pressures, increased noise levels, and the use of exotic breathing mixtures.

1. Current Equipment

Initially, efforts to provide the diver with communication
involved amplifying the voice and projecting it into the underwater
environment. Water noises near transmission frequencies, low power, and
exclusion of microphones and earphones (for low cost) contributed to
poor intelligibility and short range, making these systems impractical
for most military uses [3]. Electromagnetic and electric field
transmission have also been tried with limited success at short ranges.

Sonar has provided the most successful method of communication
to date. Using crystals possessing electrostrictive properties, a sound
pressure or acoustic wave is generated into the water and received by a
similar transducer. Sets of this design operate around a carrier
frequency of 8 KHz-42 KHz (the lower carrier producing greater range of
transmission) and utilize AM-SSB/SC transmission to lower noise
reception and increase power. The Hydro-Products 811 and Bendix PQC-2 (see
Figs. 1 and 2) underwater communicators are industry's latest efforts in
the field. Characteristics of these sets are shown in Table I.

Changes occur in these sets when placed on the diver. The
omnidirectional transducer becomes somewhat directional due to acoustical
damping by the lungs, tanks, and the neoprene wet suit worn in cold
water. Two divers operating in close proximity cannot communicate when
the lung cavity is in the line of transducers. This is a serious
defect when one user is unaware of the relative location of the second
user.

Hydro-Products 811
Underwater Communicator

FIGURE 1

Bendix PQC-2
Underwater Communicator

FIGURE 2

TABLE I

SET                 HP 811                    PQC-2

Carrier Frequency   8.0875 KHz *              8.0875 KHz *
                    AM-SSB/SC                 AM-SSB/SC

Electric Power      .5 watt *                 .6 watt *
Into Water

Range               nominal 800 meter *       500 meter minimum

Modes               Voice                     Voice, Tone, and
                                              Interrupted Tone

Transducer          Electrostrictive          Electrostrictive
                    (barium titanate)         (lead zirconate)

Pattern             Omnidirectional           Omnidirectional

Microphone          Oral Cavity               Bone Conduction
                    Microphone                (two)

Receiver            Bone Conduction           Bone Conduction
                    (one)                     (two)

Intelligibility     No information            No information
                    (under test at            (under test at
                    U. of Fla.)               U. of Fla.)

Bandwidth           300-3000 Hz *             300-3000 Hz *
                    (2700 Hz)                 (2700 Hz)

* Manufacturer's Data

The communicator bandwidth is based on the telephone principle -
a 300-3000 Hertz bandwidth will provide high intelligibility. Studies
indicate that for shallow depths this bandwidth will provide
intelligibility of 95+% [1]; but this is not necessarily true for all
conditions encountered by the diver.

Evidence points to bone conduction as the primary source of
hearing underwater [4]. All currently available communicators utilize
bone conduction reception; one of the tested communicators used bone
conduction transmission in place of the normal microphone in the oral
cavity. Some tests indicate a definite loss in intelligibility when
using bone conduction transmission; but no quantitative results support
these conclusions. The use of a single hose regulator with bone
conduction transmission caused the signal to be obscured by the noise of
bubbles passing the conductors.

These problems are not unique to the tested communicators, but
a far more important set of problems is that common to all
communication systems for free divers.

2. Problem Areas 

The most complex problem is that of speech distortion caused by
the addition of a small cavity over the mouth. To date little basic
research has been attempted in an effort to design an optimal oral
cavity. The design must balance two conflicting variables - small
volume to prevent low resonances near formants or articulated sound and
buildup of carbon dioxide, and large volume to prevent increased
pressure from affecting the speaker in articulation. The Bioengionics
Nautilus oral cavity (see Figure 3) was used to aid speaker
articulation in this research. This oral cavity has been used successfully
to increase intelligibility of systems in other tests and provides an
excellent trade off in cavity design and diver safety [5].

Increased operating depth results in increased ambient pressure.
This changes the characteristics of the resonant cavities and
the density of the breathing gas. When a diver is operating at
pressures of four atmospheres (100 feet) or greater, the voice attains a
nasal quality, causing considerable loss of intelligibility [5].

When exotic breathing gases are substituted for the primarily
nitrogen-oxygen gas, the voice attains a quality best described as the
"Donald Duck" effect. The use of exotic gases causes formant shifts
due to the increased sound velocities. Unfortunately the shifts are
non-linear; this makes correction a complex problem. The amount of
distortion due to these effects is under study by conventional
techniques at the Communications Sciences Laboratory, University of Florida
[1], and other agencies [6,7,8].

Physiological problems are of interest to ensure that the human
engineering of communicators is acceptable to the diver when operating
in an alien environment and commensurate with desired safety. The
diver feels an almost personal relationship with his equipment. The
communicator must require little or no change in equipment for the diver
to accept it unhesitatingly. A prime example is the addition of the
oral cavity, a much needed item for proper articulation. The original
design of the Bendix PQC-2 excluded use of a cavity for this reason.

Design of workable underwater communicators is feasible; but a
great deal of research is necessary before a single system will
overcome the vast number of problems. The most critical item is development
of a method to solve the deep depth and exotic gas speech distortion and
then microminiaturization of the "unscrambler" for inclusion in the free
diver's equipment.

B. SPEECH PROCESSING 

To the engineer engaged in speech research, "speech processing"
implies synthesis and analysis, automatic recognition, and speech
compression. The linguist will speak of the acoustical model in speech
production or perception and/or various articulation and
intelligibility testing. The language of speech research is foreign to the
beginner; therefore, Appendix A includes a glossary of speech and
linguistic terms.

As previously mentioned, the requirement exists for a technique
that will enable the speech analyst to accurately determine the effects
of an alien environment, such as that encountered by the diver, on the
human voice.

Some of the techniques used for speech analysis are summarized below,
followed by a brief description of the new "wrinkle" in analysis.

1. Analog Techniques

a. Recognition Analysis

The vowel sounds have been the subject of intense research.
The most common result has been the classification of the vowels by
their formants, primarily the relation of the formants to each other;
this approach has been labeled "Visible Speech" [9]. The device most
commonly used is the "acoustic spectrogram," where a visual representation
of the spoken message or phoneme is displayed in frequency versus time.
The intensity, a third parameter, is displayed by the darkness of the
plot. Thus, the transient spectrum can be viewed. A spectrogram of the
word "at" is shown in Fig. 4. The chief disadvantage of the spectrogram
is the lack of resolution.

b. Power Spectrum Measurements 

The analog spectrum analyzer is a convenient tool to view
the spectrum of a signal instantaneously. It works on the principle
of a constant width bandpass filter with a variable center frequency;
however, high precision measurements are not possible. Another
technique that borders on being classed as digital is the use of a bank of
narrow bandpass filters with increasing bandwidth as their center
frequency increases. Until recently, size and cost have limited their
use, with resolution still the determining factor.

c. Waveform Analysis 

The fact that vowel sounds can be characterized by their 
waveform asymmetry has been noted. Pairs of phonemes can be compared 
and separated on the basis of the magnitude and polarity of the output 
of a demodulator [10]. 

d. Articulation and Intelligibility Tests

To evaluate communication systems, articulation tests
were widely used during World War II. The results of the tests aided
hardware design. Intelligibility testing takes the form of reading
from phonetically balanced word lists and assigning a percentage score
for the correct responses of human listeners. The consistency of such
tests demonstrates their reliability and makes them the most commonly
used; however, the search for optimum lists of sounds, words and phrases
continues [5].

Spectrogram of the Word "AT"

FIGURE 4

2. Digital Techniques 

The analysis of speech by digital techniques owes a great 
deal of its popularity to the interest in the development of the 
"vocoder", or voice coder. Linguists and psycholinguists have been 
a trifle reluctant to master the techniques of the digital world [11]. 

Digital Spectrum analyzers prior to 1965 consisted of narrow 
band digital filters. The number of digital filters necessary for 
sufficient bandwidth and resolution required extensive memory and 
computing time; an excessive amount for speech waveforms. 

a. The Fast Fourier Transform

The capability for high resolution spectral analysis was
not available until recently discovered algorithms were implemented
[12]. The algorithm of interest in this paper is the Fast Fourier
Transform (FFT) [2]. It is an efficient algorithm for computing
the "discrete Fourier transform". The number of operations is reduced
from $N^2$ to $2N \log_2 N$, where $N$ is the number of samples. Thus, it can
be used for offline spectral analysis of processes. The advantage
of the FFT is illustrated by comparing the number of operations
required for $N = 8192$ samples: $N^2$ is in excess of 67 million, compared
to 213 thousand for the FFT. Such a saving in operations vastly
reduces the computing time.
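
The operation counts quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration written in Python (not a language used in this thesis); the function names are hypothetical, and the counts are simply the $N^2$ and $2N \log_2 N$ expressions from the text rather than a measurement of any particular program.

```python
def direct_dft_operations(n: int) -> int:
    """Operation count for a direct DFT, taken as N**2 as in the text."""
    return n * n


def fft_operations(n: int) -> int:
    """Operation count for the FFT, taken as 2*N*log2(N) as in the text."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    log2_n = n.bit_length() - 1  # exact log2 for a power of two
    return 2 * n * log2_n


n = 8192
direct = direct_dft_operations(n)  # 67,108,864 (in excess of 67 million)
fast = fft_operations(n)           # 212,992 (about 213 thousand)
ratio = direct / fast              # roughly 315 to 1

print(f"N = {n}: direct = {direct:,}, FFT = {fast:,}, ratio = {ratio:.0f}:1")
```

The ratio of roughly 315 to 1 agrees with the saving of about 300 to 1 quoted in the Introduction.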

b. Spectral Measurements

With the FFT available as a tool for generating the Fourier
coefficients, several avenues of analysis are plausible. The avenue
chosen is the power spectrum. Other spectra, such as the cepstrum,
are available but were not considered. The FFT lends itself to such
uses and could be applied in future research in this area. In claiming
that the power spectrum is available, it must be made clear that the
spectrum is only an estimate. How good the estimate is depends upon the
methods of implementing the algorithm. First, the signal is of finite
length, hence the number of samples is finite. Second, the signal
is bandlimited. The assumption is made that the signal is a member
of a stationary process.

II. DATA COLLECTION



After consultation with authorities in speech processing and
underwater communications, and after considering the time, funds, and
human effort available, the decision was made to proceed to collect
data to support two independent projects. This decision was
influenced by the loan, free of cost, of the two underwater communicators
used in the data collection. Certain control sounds and words, shown
in Table II, were recorded to support the digital speech processing.

Two vowels and four consonants were arbitrarily chosen. One sound
of each main subgroup was included [9]. This data was originally
intended to assist in isolating distortion in the two communicators.

The second project was designed to assist in judging the relative
intelligibility of the two systems under various conditions. Two
Campbell word lists were recorded by each diver. The Campbell word
lists are monosyllabic, phonetically balanced lists of 25 words each,
selected randomly from a master list of 500 words. Table III is an
example of such a list.
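
As a rough illustration of how a 25-word list might be drawn from a 500-word master list and how a listener's responses could be turned into a percentage score, the following Python sketch may help. It is an assumption-laden example only: the master list here is a set of placeholder strings (the actual Campbell master list is not reproduced in this thesis), the function names are hypothetical, and the phonetic balancing of real Campbell lists is not modeled.

```python
import random


def draw_word_list(master_list, size=25, seed=None):
    """Draw `size` distinct words at random from the master list."""
    rng = random.Random(seed)
    return rng.sample(master_list, size)


def percent_correct(spoken, heard):
    """Percentage of words the listener reported correctly, position by position."""
    if len(spoken) != len(heard):
        raise ValueError("spoken and heard lists must have the same length")
    correct = sum(s.strip().upper() == h.strip().upper()
                  for s, h in zip(spoken, heard))
    return 100.0 * correct / len(spoken)


# Placeholder master list of 500 entries (hypothetical, not the Campbell words).
master = [f"WORD{i:03d}" for i in range(500)]

test_list = draw_word_list(master, seed=1)

# Simulated listener responses: every word correct except the first.
responses = list(test_list)
responses[0] = "UNINTELLIGIBLE"

score = percent_correct(test_list, responses)
print(f"{len(test_list)} words, score = {score:.1f}%")
```

With one miss in 25 words the score is 96 percent, which shows the four-percent granularity of a single 25-word list.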

Original plans called for data to be taken in two locations: an
anechoic pool free of outside interference, and the ocean, to provide
a noisy environment with varying conditions of range and depth. Sea
conditions and boat availability in January and February 1969
allowed only eight (8) days to be used for taking data, although the
effort was made daily. Finally, after 30% of the desired data had been
collected, theft of equipment, lack of a stable platform in heavy seas,
poor ocean visibility, and shortage of personnel dictated a change to
a quiet, fresh water environment having a stable platform. Two
underwater communicators, the Hydro-Products 811 and the Bendix PQC-2, were
used to transmit the data to an AN/WQC-1A communication set.
Characteristics of this set appear in Table IV. This receiver fed a Precision
Instruments PI 6200 tape recorder. The data was recorded at 37.5 ips
and stored on magnetic tape. All recording used frequency modulation
to allow for subsequent linear frequency translation by changing tape
recorder speed.

TABLE II

Control Sounds and Words

Phoneme    Word      Class of Phoneme

/AE/       AT        Vowel
/E/        MET       Vowel
/P/        PAY       Voiceless Stop
/G/        GO        Voiced Stop
/ʃ/        SHOULD    Voiceless Fricative
/F/        FIRST     Voiceless Fricative
/TH/       THIN      Voiceless Fricative
/V/        VOTE      Voiced Fricative

Control words: DOLLS, NUTS, BOOK, JUMP, ART, CARVE, OR, ARE

TABLE III

Sample Campbell List (P-7)

AIM       COOK      CHAIR     AID       SMOOTH
AM        ARM       ACE       POOR      UP
HIGH      YES       SKIN      THREE     LIE
HURT      GIVE
A. ANECHOIC POOL

The anechoic pool located in Room 025 of Spanagel Hall, Naval
Postgraduate School, was utilized to provide the noise free
environment. The diver was dressed in normal diving equipment as shown in
Figs. 5 and 6. Initially the diver spoke into free space as the
control sounds, control words, and two Campbell word lists were
recorded. Then an oral cavity was placed in position and the
procedure repeated. A block diagram of the test setup is shown in Fig. 7.

B. OCEAN RANGE

The ocean range is located in Monterey Bay 1500 meters off Del Monte
Beach. The location is shown in Fig. 8. The underwater portion of
this range is still located as indicated. The range is located in 100
feet of water over a flat sandy bottom to minimize bottom reflection
interference. Figs. 9 and 10 show the range layout. A diver platform
is located at the main buoy to allow orientation of the diver toward
the receiving transducer. This orienting device prevents signal loss
due to attenuation by the lung cavities, wet suits, and other items of
diver dress. Figure 11 shows a diver in the platform ready to
communicate. Figure 12 is a block diagram of the test set-up. Note
that the tape recorder is mounted on an unstable platform (the boat)
and must remain within short range (about 15 meters) of the desired
buoy range marker. Table V indicates the desired conditions for
recording data.

TABLE IV

Characteristics of AN/WQC-1A Communication Set

OPERATION     AM/SSB-SC
FREQUENCY     8.3 - 11 KHz
MODES         Receive, Transmit, Tone
TRANSDUCER    Omnidirectional w/o Baffle
              15 db down 60°-330° w/ Baffle
BANDWIDTH     300-3000 Hz

Diver in Anechoic Pool
Without Oral Cavity

FIGURE 5

Diver in Anechoic Pool
With Oral Cavity

FIGURE 6

Map of Ocean Range Location

FIGURE 8

FIGURE 10

Diver in Orientation Platform

FIGURE 11



C. FRESH WATER RANGE

The fresh water range was set up in San Vicente Reservoir, San
Diego County, California. Figure 13 shows the site. It differs from
the previously described ocean range in four ways.

1. The granite walls and concrete dam face made receiving
transducer orientation critical to prevent side reflections and
minimize reverberation. This was accomplished by suspending the
transducer on one line and using two guidelines to ensure exact
positioning.

2. The tape recorder was placed on a stable barge and the
diver moved to different locations, thus reversing the roles used in the
ocean.

3. Quiet, fresh water replaced the noisy ocean environment.

4. Diver orientation was accomplished using a compass.

Table V shows the desired conditions for data collection. Eight
days were required for movement to the reservoir, set up on the
barge, collection of data, and return to Monterey, California.

San Vicente Reservoir Test Site

FIGURE 13

TABLE V

Recording Conditions for Ocean and Fresh Water Ranges

DIVER             BOND                  MYATT
SET               PQC-2     HP 811      PQC-2     HP 811
RANGE    DEPTH

X - DATA RECORDED IN OCEAN RANGE
Y - DATA RECORDED IN FRESH-WATER RANGE

III. DATA PROCESSING



The instruments available at the Naval Postgraduate School for
the processing of the data include the IBM 360/67 computer and the SDS
9300 medium sized computer with a hybrid interface to a COMCOR 5000
analog computer.

A. ANALOG TO DIGITAL CONVERSION 

The method of processing the data depended largely on the
capabilities of the hybrid computer setup of the COMCOR 5000 and the SDS
9300. In order to digitize the data from the tapes, the sampling rate
of the analog to digital (A-D) converter had to be at least twice as
high as the largest frequency of interest (the Nyquist rate).
Unfortunately, the present write scheme of the SDS 9300 limits the
sample rate to 2500 Hertz. This virtually eliminates the use of the
A-D converter for the entire audioband. It was determined that the
frequencies of interest for this experiment slightly exceed the
passbands of the two communicators, i.e., 300-3000 hertz; a passband of
200-4000 hertz was used.

The sampling frequency dilemma was circumvented by utilizing the
features of the FM tape recorder. The PI 6200 instrumentation tape
recorder will record at three speeds - 37.5 ips, 3.75 ips, and .375
ips. The center frequency of the carrier is governed by the choice of
the tape speed - 50,000 hertz, 5,000 hertz and 500 hertz, respectively.
Suitable lowpass filters are switched into the circuitry for these
speeds. At 37.5 ips the tape recorder input yields 0 db gain from
dc to 10,000 hertz, with a signal to noise ratio of 42 db. By playing
back at 3.75 ips, the discriminator frequency is divided by a factor
of ten, as are the signal frequencies. This linear frequency translation
allows a sample rate of 800 hertz to cover the translated audioband.
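
The effect of the slowed playback on the sampling requirement can be worked through numerically. The Python sketch below is illustrative only; the function and constant names are hypothetical, and it assumes an ideal recorder in which every frequency scales exactly with the ratio of playback speed to recording speed.

```python
def translate_frequency(freq_hz, record_ips, playback_ips):
    """Frequency observed on playback of a tape recorded at record_ips
    and played at playback_ips (ideal linear translation assumed)."""
    return freq_hz * playback_ips / record_ips


def minimum_sample_rate(highest_hz):
    """Smallest sample rate that avoids aliasing: twice the highest frequency."""
    return 2.0 * highest_hz


RECORD_IPS = 37.5
PLAYBACK_IPS = 3.75
BAND_HZ = (200.0, 4000.0)    # passband used in this experiment
CONVERTER_LIMIT_HZ = 2500.0  # sample-rate limit quoted for the SDS 9300

translated_band = tuple(
    translate_frequency(f, RECORD_IPS, PLAYBACK_IPS) for f in BAND_HZ
)
rate_without_translation = minimum_sample_rate(BAND_HZ[1])
rate_with_translation = minimum_sample_rate(translated_band[1])

print("translated band (Hz):", translated_band)
print("minimum rate at full speed (Hz):", rate_without_translation)
print("minimum rate at 1/10 speed (Hz):", rate_with_translation)
print("fits converter limit:", rate_with_translation <= CONVERTER_LIMIT_HZ)
```

At full speed the 4,000 hertz upper limit would call for at least 8,000 samples per second, well beyond the 2500 hertz limit; at one-tenth speed the band becomes 20-400 hertz and 800 samples per second suffice.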

The choice of parameters for sampling the data was hampered by
the necessity for trade-off in frequency and time resolution, often
referred to as the analogy to Heisenberg's "uncertainty principle" [13].
Ignoring statistical stability and those terms discussed in references
on power spectra, the frequency resolution, $\Delta f$, is defined as:

$$\Delta f = f_s / N$$

where $f_s$ = sampling frequency
      $N$ = number of samples

The dilemma is obvious. The larger the number of samples, the smaller
the separation between spectral lines, hence, greater resolution. The
larger $N$ also extends the time duration over which the signal must be
sampled, hence a loss of time resolution. To prevent aliasing, the
sampling frequency must be at least twice the highest frequency of the
signal. The larger sampling frequency reduces the frequency resolution.

The solution was to investigate the data for both time and
frequency resolution. The time resolution was increased by using a lower
number of samples for calculating an "evolutionary spectrum" and a
larger number of samples for a spectrum of high resolution.

The sample rate was selected to be 2048 hertz. This frequency was
selected because the FFT algorithm is most efficient when $f_s$ is some
number that is a power of two. It is more than five times the highest
frequency of interest; i.e., 400 hertz (4,000 hertz translated by a
factor of ten).

The number of samples taken must necessarily be a number that is
a power of two for the FFT algorithm. Thus, the number of samples taken
for each record was 8192. This allowed the signal to be sampled for
a period of 0.4 seconds in real time (4 seconds at the reduced playback
speed). The maximum resolution possible is 0.25
hertz, translated by the factor of ten to 2.5 hertz, real time.
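
The trade-off between frequency and time resolution can be tabulated directly from $\Delta f = f_s / N$. The short Python sketch below is an illustration only; the helper names are hypothetical, and the 2048-sample record is included simply as an example of a shorter record with better time resolution.

```python
SAMPLE_RATE_HZ = 2048.0  # sample rate applied to the slowed-down tape
SPEED_FACTOR = 10.0      # tape played back at one-tenth recording speed


def frequency_resolution(sample_rate_hz, n_samples):
    """Spacing between spectral lines, delta f = fs / N."""
    return sample_rate_hz / n_samples


def record_duration(sample_rate_hz, n_samples):
    """Time spanned by N samples taken at the given rate."""
    return n_samples / sample_rate_hz


results = {}
for n in (8192, 2048):
    delta_f_tape = frequency_resolution(SAMPLE_RATE_HZ, n)
    duration_tape = record_duration(SAMPLE_RATE_HZ, n)
    results[n] = {
        "delta_f_real_hz": delta_f_tape * SPEED_FACTOR,   # undo the translation
        "duration_real_s": duration_tape / SPEED_FACTOR,  # speech time covered
    }
    print(f"N = {n}: resolution = {results[n]['delta_f_real_hz']} Hz, "
          f"duration = {results[n]['duration_real_s']} s (real time)")
```

For 8192 samples this gives 2.5 hertz resolution over 0.4 seconds of speech; for 2048 samples it gives 10 hertz resolution over 0.1 seconds.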

To sample the data, the tape recorder output was filtered by a
Krohn-Hite adjustable bandpass filter, Model 310A. The
characteristics of this filter are shown in Table VI. The frequency limits
were 20 and 400 hertz. The signal was then amplified by a factor of
30, and digitized with the clock rate controlled by the Wavetek 116
oscillator. The block diagram of Fig. 14 shows the sampling setup.

The signal was monitored by audio means and visually on an
oscilloscope to determine the proper time to commence digitizing. Herein
lies a fundamental weakness in this experiment; it will be discussed
more fully in the analysis section. Briefly, no control was used to
ensure that the sampling process started at the same instant the
phoneme commenced.

A vast amount of data was recorded for analysis. With time
becoming a factor, it was necessary to limit the number of phonemes
examined.

The phoneme selected was /AE/, the vowel sound in "at" - a front
vowel. The same phoneme was sampled for the following conditions -
each communicator at a range of ten meters and depth of 80 feet, and
each communicator in the anechoic pool with and without oral cavity.

The sampled data from the SDS 9300 was placed on magnetic tape.
The output was seven track in 24 bit octal format. The sample routine
was an in-house routine developed by former NPGS students [14,15].

TABLE VI

Characteristics of Model 310A Krohn-Hite Filter

KROHN-HITE MODEL 310C
NORMALIZED FILTER RESPONSE

Attenuation:             24 db/octave below -12 db point
Input Characteristics:   6 megohms in parallel with 50 µµf
                         5 volts rms maximum
Output Characteristics:  200 ohms, 12 milliwatts

FIGURE 14



B. INTERFACING 



The output of the SDS 9300 is written in seven track. As is the
case with all computers when trying to mate with the IBM 360, the
interface is not as smoothly refined as might be expected. The seven
track tape must be converted to nine tracks and the 24 bit, octal
format converted to 32 bit, hexadecimal format. The program required
is listed in Appendix B.
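
The word-length part of this conversion can be sketched as follows. This Python fragment is not the program of Appendix B; it is a hypothetical illustration that assumes each 24 bit sample is written as eight octal digits in two's-complement form, and it ignores the seven-track to nine-track tape handling entirely.

```python
def octal_word_to_int(octal_text):
    """Interpret eight octal digits as a signed 24-bit word.

    Two's-complement representation is an assumption made for this sketch.
    """
    if len(octal_text) != 8:
        raise ValueError("a 24-bit word is written as exactly 8 octal digits")
    raw = int(octal_text, 8)
    if raw >= 1 << 23:   # sign bit set: value is negative
        raw -= 1 << 24
    return raw


def to_hex32(value):
    """Render a signed integer as a sign-extended 32-bit word in hexadecimal."""
    return format(value & 0xFFFFFFFF, "08X")


for word in ("00000012", "77777777", "40000000"):
    value = octal_word_to_int(word)
    print(f"octal {word} -> {value:>9d} -> hex {to_hex32(value)}")
```

The point of the example is the sign extension: a negative 24 bit sample must have its upper eight bits filled with ones when widened to 32 bits, or it would be read back as a large positive number.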

The tane drives of the IBM 360 required perseverance on the part 
of the onerators. The tane from the SDS 9300 was accented rouqhly 
one out of five tries. 

No attempt was made to store this data on a direct access device.
However, doing so seems desirable to conserve computer time.

C. DIGITAL PROCESSING 

The data was stored on magnetic tape in 32 bit words. The subroutine
HARM, part of the Standard Scientific Package for the IBM 360/67,
was used to calculate the estimates for the Fourier coefficients of
the signal. This routine is based on the Cooley-Tukey algorithm.

As previously mentioned, the entire 8192 data points could be used
to compute a spectrum of high frequency resolution but with time
uncertainties. Alternately, only a portion of the data points could be
used with decreased frequency resolution but with increased certainty
in the time domain.

The techniques complement each other. The technique chosen computed
a) a power spectrum using the entire 8192 points, and b) to assist in the
analysis of this spectrum, a transient spectrum for the first
2048 points. The two spectra are titled "spectrum" and "transient
spectrum." The spectrum has a resolution of 2.5 hertz. The transient
spectrum has a frequency resolution of ten hertz.
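The two resolution figures follow from dividing the equivalent sampling rate by the number of points transformed. A minimal sketch in Python, assuming an equivalent rate of 20,480 hertz (2048 hertz at the converter with the tape played at one-tenth the record speed). That rate is inferred from the 2.5 hertz figure rather than stated here, and the function names are illustrative only.

```python
def frequency_resolution(sample_rate_hz, n_points):
    """Spacing between adjacent spectral estimates, in hertz."""
    return sample_rate_hz / n_points


def record_duration(sample_rate_hz, n_points):
    """Length of signal covered by n_points samples, in seconds."""
    return n_points / sample_rate_hz


# Assumed equivalent sampling rate: 2048 Hz at the converter, times ten.
EFFECTIVE_RATE_HZ = 2048 * 10

if __name__ == "__main__":
    for n in (8192, 2048):
        print(n, "points:",
              frequency_resolution(EFFECTIVE_RATE_HZ, n), "Hz resolution,",
              record_duration(EFFECTIVE_RATE_HZ, n), "s of signal")
```

The trade is explicit: four times fewer points gives four times coarser frequency spacing but confines the estimate to one quarter of the time.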

The option exists whether or not to apply a smoothing function.
The need for a smoothing function might be described by mentally
visualizing the side lobes of the sample function. These side lobes
cause "leakage" from one sample to another. A smoothing function
such as the now familiar "Hanning spectral window" reduces the side
lobe leakage from $(\omega - \omega_1)^{-1}$ to $(\omega - \omega_1)^{-3}$
[16,17]. Whether or not the smoothing function is desirable depends
on the type of signal. The literature says that noise-like signals
with relatively flat spectrum are not suitable for smoothing
functions, but when the signal is non-noise-like and the position of
signal energy in the frequency domain is more important than the
amount of energy, then a smoothing function is appropriate. The
"Hanning window" was used. The estimates of the Fourier coefficients
were operated on as follows (for M coefficients),

    $A(k) = -\tfrac{1}{4}\,a(k+1) - \tfrac{1}{4}\,a(k-1) + \tfrac{1}{2}\,a(k)$

    $B(k) = -\tfrac{1}{4}\,b(k+1) - \tfrac{1}{4}\,b(k-1) + \tfrac{1}{2}\,b(k)$

    $A(0) = \tfrac{1}{2}\,\bigl(a(0) - a(1)\bigr)$

    $B(0) = 0, \qquad k = 1, 2, 3, 4, \ldots$

At the expense of a reduction in "statistical stability", this
window was used; its time domain equivalent being,

    $X(j) = \tfrac{1}{2}\,\bigl(1 - \cos(2\pi j/n)\bigr)\,x(j), \qquad j = 0, 1, 2, \ldots, n-1$
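The equivalence between the three-point weighting of the coefficients and the time domain window can be checked numerically. The following Python sketch is illustrative only: it uses a naive discrete Fourier transform on complex coefficients rather than subroutine HARM, treats the coefficient indices as circular, and all function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(n^2); adequate for a small check."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]


def hann_in_frequency(coeffs):
    """Weight each coefficient by 1/2 and its two neighbours by -1/4 (circular indexing)."""
    n = len(coeffs)
    return [0.5 * coeffs[k]
            - 0.25 * coeffs[(k - 1) % n]
            - 0.25 * coeffs[(k + 1) % n]
            for k in range(n)]


def hann_in_time(x):
    """Multiply the samples by the window 1/2 (1 - cos(2 pi j / n))."""
    n = len(x)
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * j / n)) * x[j] for j in range(n)]


def max_difference(x):
    """Largest discrepancy between the two routes to the smoothed coefficients."""
    via_frequency = hann_in_frequency(dft(x))
    via_time = dft(hann_in_time(x))
    return max(abs(u - v) for u, v in zip(via_frequency, via_time))


if __name__ == "__main__":
    signal = [math.sin(0.7 * j) + 0.3 * j for j in range(16)]
    print("largest difference:", max_difference(signal))
```

For a real signal the cosine coefficients are symmetric about k = 0, so the circular weighting at k = 0 reduces to one half of (a(0) - a(1)), which agrees with the end condition given for A(0).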

A second smoothing operation was performed after the coefficients of
the power spectrum were computed.

    $p(k)^2 = A(k)^2 + B(k)^2, \qquad k = 0, 1, 2, \ldots, N/2$

    $P(k) = \tfrac{1}{4}\,\bigl(p(k+1) + p(k-1)\bigr) + \tfrac{1}{2}\,p(k), \qquad k = 0, 1, 2, \ldots, M/2$

This is again the familiar "Hanning window" [17]. Since the main
concern is that the estimates are appropriate to the nominal
frequencies, the linearly hanned spectrum suits its purpose.

The power spectra were plotted on a logarithmic scale, allowing
a forced scale suitable for the entire dynamic range of data points.
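The second smoothing pass and the logarithmic scaling can be sketched as follows. This is a hedged Python illustration, not the thesis program: the treatment of the two end points (each reuses its single neighbour) and the floor applied before taking the logarithm are assumptions, and the function names are hypothetical.

```python
import math


def power_spectrum(a, b):
    """Squared magnitude from cosine (a) and sine (b) coefficient estimates."""
    return [ak * ak + bk * bk for ak, bk in zip(a, b)]


def hann_smooth(p):
    """1/4, 1/2, 1/4 running average over neighbouring estimates.

    Assumption: at either end the missing neighbour is replaced by the
    existing one. Requires at least two values.
    """
    n = len(p)
    smoothed = []
    for k in range(n):
        left = p[k - 1] if k > 0 else p[k + 1]
        right = p[k + 1] if k < n - 1 else p[k - 1]
        smoothed.append(0.25 * (left + right) + 0.5 * p[k])
    return smoothed


def to_db(p, floor=1e-12):
    """Convert power values to decibels; the floor avoids log of zero (assumption)."""
    return [10.0 * math.log10(max(value, floor)) for value in p]


if __name__ == "__main__":
    a = [0.0, 3.0, 0.0, 1.0]
    b = [0.0, 4.0, 0.0, 0.0]
    print(to_db(hann_smooth(power_spectrum(a, b))))
```

A decibel scale compresses the range between the strongest formant peak and the weakest estimate, which is what lets one plot scale serve every data point.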






IV. ANALYSIS 



The phoneme selected for study was not an arbitrary choice. The
noise-like sounds such as fricatives or stops do not lend themselves
to Fourier analysis [18]. The vowel sounds, however, have a steady
state quality, with slowly changing transient spectra. The front
vowel, /AE/ as in "at", demonstrates a well defined spectrum with
three important formant frequencies. The literature shows general
agreement on the values of these formants:

    $F_1$ = 660 hertz

    $F_2$ = 1720 hertz

    $F_3$ = 2410 hertz [19]

Associated with these formants are bandwidths of 60, 100, and 120
hertz, respectively. One reference shows the formants to be somewhat
lower.

    $F_1$ = 579 hertz

    $F_2$ = 1691 hertz [20]

A. FORCED AND NORMAL SPEECH 

In order to establish some sort of reference, a diver's voice was
recorded on a Kay Missilyzer spectrograph. Figure 15 shows the front
vowel /AE/ recorded in an anechoic chamber. An Altec 21BR150-2
microphone was used and the diver spoke in a normal voice. Figure 16 is
the same sound spoken in a loud, forceful manner, just as if the diver
were using an underwater communicator. The requirement for the forced
speech to increase intelligibility has been documented [1].
FIGURE 15

FIGURE 16



The formant frequencies for the normal speech are shown with
arrows in Figure 15. They generally agree with the values cited
previously. Figure 16 shows the formants greatly increased. The
effort to increase the level of sound has caused the diver to lift the
pitch of his voice in an unnatural manner. The spectrum is not constant
but rises rapidly to a maximum value for each formant and then slowly
decays. This change in the spectrum is of major importance. Further
verification of this difference in the forced and normal speech power
spectral estimates is borne out by Figures 17 (a-c) and 18 (a-c).

The normal speech power spectral output closely approximates a line
spectrum where shifts can be easily measured.

B. FREQUENCY TRANSLATION 

To check the frequency translation of the PI-6200 tape recorder,
a 384 hertz tone was recorded at 37.5 ips and sampled at the rate
of 2048 hertz. The power spectrum is shown in Figure 19. Figure 20
shows the power spectrum when the tape recorder speed is slowed to
3.75 ips. The expected change in frequency by a factor of ten occurred.
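The arithmetic behind this check is a simple ratio of tape speeds. A minimal Python sketch, with hypothetical function names, of the frequency seen at playback and of the equivalent sampling rate referred back to the original recording:

```python
def played_back_frequency(recorded_hz, record_ips, playback_ips):
    """Frequency observed when a tape recorded at record_ips is played at playback_ips."""
    return recorded_hz * playback_ips / record_ips


def equivalent_sample_rate(adc_rate_hz, record_ips, playback_ips):
    """Sampling rate referred to the original signal when the tape is slowed for conversion."""
    return adc_rate_hz * record_ips / playback_ips


if __name__ == "__main__":
    tone = played_back_frequency(384.0, 37.5, 3.75)
    rate = equivalent_sample_rate(2048.0, 37.5, 3.75)
    print("tone at playback:", tone, "Hz; equivalent sampling rate:", rate, "Hz")
```

Slowing the tape by ten lowers every recorded frequency by ten, so a converter limited to a few kilohertz behaves, with respect to the original speech, like one running ten times faster.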

C. POWER SPECTRAL ESTIMATES 

1. Without Oral Cavity

A sample spectrum for the recording made using the communicators
without oral cavity is shown in Figure 21 (a-c). The spectrum is
cluttered with power at frequencies in the neighborhood of the
formants. This clutter is sufficient to prevent accurate measurement of
the formants, but should be anticipated in light of the observations
made concerning the rapid changes in the spectrum in Figure 16. This
FIGURE 17(a). Normal Speech Spectrum of /AE/, 0-1000 Hz.
FIGURE 17(b). Normal Speech Spectrum of /AE/, 1-2 KHz.
FIGURE 17(c). Normal Speech Spectrum of /AE/, 2-3 KHz.

FIGURE 18(a). Forced Speech Spectrum of /AE/, 0-1000 Hz.
FIGURE 18(b). Forced Speech Spectrum of /AE/, 1-2 KHz.
FIGURE 18(c). Forced Speech Spectrum of /AE/, 2-3 KHz.

FIGURE 19. 384 Hz. Tone at 37.5 ips
FIGURE 20. 384 Hz. Tone at 3.75 ips

FIGURE 21(a). Phoneme /AE/, Without Oral Cavity, 0-1 KHz.
FIGURE 21(b). Phoneme /AE/, Without Oral Cavity, 1-2 KHz.
FIGURE 21(c). Phoneme /AE/, Without Oral Cavity, 2-3 KHz.

FIGURE 22(a). Phoneme /AE/, With Oral Cavity, 0-1 KHz.
FIGURE 22(b). Phoneme /AE/, With Oral Cavity, 1-2 KHz.
FIGURE 22(c). Phoneme /AE/, With Oral Cavity, 2-3 KHz.

(All plots: power in db versus frequency.)

spectrum was estimated using 8192 sample points. Theoretically, the
frequency resolution is 2.5 hertz. In reality, only a rough estimate
can be made to determine the formants within 50 hertz. Blaming the
poor results on the manner in which the diver forced the sound, the
next step was to compute an "evolutionary spectrum" (transient
spectrum). Using the first 2048 sample points, the theoretical
resolution is ten hertz. The transient spectrum for the first 2048 points
is shown in Figure 23. This covers a signal for 0.1 second. The
obvious blunder is that no technique was used to ensure that the
sampling process began precisely at the beginning of the sound. This
is critical in the transient spectrum, in that an additional
uncertainty has been introduced. That is, does the sampling start before
the spectrum reaches the highest values in frequency? If the signal
had been the phoneme recorded in the normal voice, this would not
have been critical since the spectrum is flat for a reasonable period
of time.

Since the data was already recorded, it was not possible to
record on an adjacent channel any form of signal to coincide with
the start of the sound so that the sample process could have been
automatically triggered. The alternative was to simultaneously
monitor the signal by audio and visual means and manually trigger the
sampling process. Obviously this technique has flaws and the exact
point of entry into the data stream is uncertain.

Even with the ambiguities, the fact that the formants are shifted 
up in frequency due to the forced sound level is apparent. The first 

FIGURE 23. Transient Spectrum, Without Muzzle, 0-3 KHz.
FIGURE 24. Transient Spectrum, With Muzzle, 0-3 KHz.
FIGURE 25. Transient Spectrum, 80 feet, 0-3 KHz.

(All plots: power in db versus frequency.)

formant can be estimated at 840 hertz, in agreement with Figure 15.
The second formant is centered at 1720 hertz and the third at 2550
hertz. Bandwidth is extremely hard to estimate.

2. With Oral Cavity 

The transient spectrum with the oral cavity is shown in Fig. 24.
A sample spectrum for the recordings made using the communicators
with the oral cavity is shown in Figure 22 (a-c). The literature
concerned with small, non-radiating enclosures, such as the oral
cavity used, states that in theory the effect of the muzzle is to 
raise the formants in frequency; however in reality, the distortion 
is much more complicated. "... The formants may become indistinct 
rather than shift in any direction..." [21]. These observations are 
partially borne out in the spectra referenced above. F^ does appear 
to be indistinct as does F^. F^ contains the major concentration of 
power, the center being shifted down instead of up in frequency. The 
inability to massage the data to make concrete observations is
regrettable.

3. Increased Ambient Pressure With Oral Cavity 

Fig. 25 shows the transient spectrum at 80 ft. for the first 2048
points. A sample spectrum for the recordings made at 80 feet is shown
in Figure 26 (a-c). The effect of increased pressure is well documented
[1,5,6]. The formants should shift upward in frequency in a non-linear
fashion, "...the first formant is affected more than the second
formant. It is also observed that the intensity level of voiced
sounds increased with increasing ambient pressure but that typical
noise sounds, such as fricatives, and the burst part of unvoiced stops
display a drop in intensity relative to voiced sounds ..." [6].

This referenced source cites a theoretical expression for computing 
the upward shift in frequency of the first formant. 

FIGURE 26(a). Phoneme /AE/, 80 feet, 0-1 KHz.
FIGURE 26(b). Phoneme /AE/, 80 feet, 1-2 KHz.
FIGURE 26(c). Phoneme /AE/, 80 feet, 2-3 KHz.

    $F_{1p}^2 = F_{11}^2 + F_{w1}^2\,(p - 1)$

where $F_{1p}$ is the frequency of the first formant at p atmospheres
pressure, $F_{11}$ is the first formant at one atmosphere, and $F_{w1}$
is the resonance frequency of the closed vocal tract at one atmosphere.
Applying this expression to the situation under discussion, namely a
depth of 80 feet, the theoretical first formant shift can be calculated.
Assuming a resonant frequency of 170 hertz, and using the estimate of
840 hertz for $F_{11}$, with p = 3.42 atmospheres, $F_{1p}$ is computed
to be 878 hertz, a shift of 38 hertz. Examining the spectrum for 80 feet
in Figure 26, the first formant can be roughly estimated to be 910
hertz, significantly larger than expected. The source of this shift
is uncertain; it could be related to the increased effort of the diver,
as well as to the increase in pressure. $F_2$ and $F_3$ are indistinct,
the power distributed in many frequencies throughout the band.
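The calculation can be sketched in a few lines of Python. The sketch assumes the expression has the form "first formant at pressure, squared, equals first formant at one atmosphere, squared, plus closed-tract resonance, squared, times (p - 1)", and that pressure in sea water rises by one atmosphere per 33 feet of depth. Both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def pressure_atm_from_depth_ft(depth_ft, feet_per_atm=33.0):
    """Absolute pressure in atmospheres at a sea water depth (assumed 33 ft per atmosphere)."""
    return 1.0 + depth_ft / feet_per_atm


def first_formant_at_pressure(f11_hz, fw1_hz, pressure_atm):
    """First formant at pressure_atm, from the assumed form F1p^2 = F11^2 + Fw1^2 (p - 1)."""
    return math.sqrt(f11_hz ** 2 + fw1_hz ** 2 * (pressure_atm - 1.0))


if __name__ == "__main__":
    p = 3.42                                   # atmospheres, as quoted for 80 feet
    f1p = first_formant_at_pressure(840.0, 170.0, p)
    print("pressure at 80 ft:", round(pressure_atm_from_depth_ft(80.0), 2), "atm")
    print("predicted first formant:", round(f1p, 1), "Hz; shift:", round(f1p - 840.0, 1), "Hz")
```

With the quoted values this form predicts a first formant near 881 hertz, within a few hertz of the computed figure given in the text and still well below the 910 hertz read from the spectrum.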
V. CONCLUSIONS AND RECOMMENDATIONS



The original objective of isolating the distortion of the underwater
communicators was not accomplished. The problem narrowed to
developing a tool to examine the speech power spectrum in detail with
high resolution. It is felt that, for non-noise-like phonemes, the
FFT is the tool available to the analyst to perform this high
resolution analysis. Some of the final massaged output reinforces this
belief. Line spectra can be the final result of speech wave forms if
a number of the troubles encountered are corrected in subsequent
research. To assist in future work a summary of the troubles
encountered and recommendations for correction are submitted.

A. DATA COLLECTION 
1. Equipment

The proper equipment must be available to support the research
in all phases of data collection and remain available for recall during
the processing phase. This affects a number of topics, range location,
etc., but is best illustrated by the necessity for long term
availability of the underwater communicators. In this experiment, the
communicators were on short term loan and a number of experimental
techniques were hastily contrived to support the data collection.

The proper diving equipment must be made available to support the
collection effort. Diver comfort in cold water requires the addition
of heated suits to prevent changes in speech due to loss of heat. The
orientation problem, discussed later, calls for a helmet or hood type
lantern. The type developed for Sea Lab III by Ben Saltzer of Naval
Undersea Warfare Center, Pasadena, California is recommended. Short
term loan of personal equipment and use of outdated equipment
complicates the overall effort and reduces efficiency.

2. Tape Recorder

The tape recorder deserves special mention; the choice is
critical. The PI 6200 was chosen because it could be adapted for
dc power and provided the linear frequency translation made necessary
by the slow sample rate. The PI 6200 tape recorder is delicate and
required special handling, but is suitable for use at a fixed recording
site.

3. Range Location

The range location is critical. The attempt to set up a range
in Monterey Bay during the worst weather in years cost many man hours.
The range was completed and used; however, theft by local fishermen
negated the effort. If Monterey Bay is to be used in the future (it
should not!) the summer months provide the best ocean conditions, and
notices should be put out to all fishermen asking them to cooperate in
avoiding destruction of the range. The Monterey Harbormaster's Office
will assist in this respect. Time and effort can be saved if a better
location is used, specifically, Stage I at NSDRL, Panama City, Florida.

4. Diver and Receiving Transducer Orientation

Diver orientation must be closely controlled as must the
orientation of the receiving transducer. The lung cavities and wet
suit cause the transmission of the communicators to be highly
directional. If the diver is not oriented properly, the signal strength
will be reduced; similarly the WQC-1A transducer is directional, so must
be oriented to provide maximum signal reception. In Monterey Bay the
diver's orientation was fixed by a platform allowing him to sit down.
No attempt was made to orient the transducer in the ocean range. At
San Vicente Reservoir the granite walls and concrete dam face required
that the receiving transducer be oriented to preclude reflections. The
diver was oriented using a wrist compass to maintain the proper line to
the receiving transducer. The diver held a flashlight in one hand to
assist in reading a word list in total blackness, while he clung to a
safety line to control his depth and location. It is no wonder the
result was fading due to diver drift from the proper diver/transducer
line. A setup of the DICORS type would materially assist in solving
the orientation problem [5].

5. Safety

Additional divers, preferably with a technical background, are
needed for safety and convenience. Throughout this research, the divers
violated normal safety standards by diving alone. This left one man
at the surface to operate and monitor the myriad of controls for
recording. The topside man had no hope of monitoring the data
collection with confidence.

6. Spurious Noises

Noises associated with tape recorder heads, interference of a
60 cycle nature, and surface and bottom reflections must be
attenuated and constantly monitored. Data should be recorded only if
optimum conditions exist.

7. Forced Speech

The data collection must support the data processing. The very
fact that two independent projects were undertaken reduced the
effectiveness of either project. The method of collecting data for
intelligibility tests is vastly different from that needed to support
the speech processing, the subject of this paper. Intelligibility tests
are optimized when the diver raises the level of his voice and makes a
determined effort to speak clearly. However, as mentioned repeatedly,
the task of isolating the problems of distortion by use of the power
spectrum requires the diver to speak in the same manner throughout.

The line spectrum desired is most nearly approached when the voice
is articulated in a normal manner. This implies that instead of
forcing the voice on the surface to agree with the technique used for
clarity at depth, the voice must be used in a normal manner at each
stage of the experiment, a reverse of what was done.

B. DATA PROCESSING 

The processing technique must be thoroughly understood and be used 
successfully prior to the collection of the data. Lack of a hybrid 
interface and limited availability of the underwater communicators 
precluded these conditions being met. 

1. Sample Rate

The hybrid computer facility should modify the write scheme
to ensure that the A-D conversion rate is high enough to accommodate
speech signals. The desired rate is at least ten kilohertz. The
analog computer, the COMCOR 5000, is presently rated at 25 kilohertz;
the limiting item is the digital computer, the SDS 9300. The
limitation of 2.5 kilohertz forced the conversion of the data at a playback
speed of 3.75 ips, one-tenth the record speed. In theory, no error
is introduced but it is desirable to process at the record speed.
2. Filtering

Filtering of the signal prior to A-D conversion should be
variable and of high quality. The Krohn-hite filter is suitable for
such work and in many ways is superior to a fixed bandpass filter
using active devices. A voltage controlled bandpass active filter
would also be suitable.

3. Interfacing

Converting the seven track, 24 bit tape to a nine track, 32
bit tape was a particular stumbling block, requiring long hours to
overcome. Appendix B is included to assist in overcoming this
problem in the future.

4. Output Format

The output from the IBM 360 can be displayed in a variety of
ways; however, the CalComp offline plotter is most suitable for
analysis. A more efficient draw routine would make processing easier
and save compute time and core memory.
APPENDIX A GLOSSARY



PHONEME - one of a finite (40) number of sounds that characterize
speech. Six major subgroups (classes) are shown in Table VII.

FRICATIVE - a noise-like sound, primarily produced by blowing air over
the teeth, as in "s", "sh", and "f."

STOP - a sound produced by holding the breath and exploding it in a
burst of noise, as in "p", "t", and "k."

VOWEL - a steady state sound as in all common vowels.

GLIDE - a sound formed by continually moving the articulators
(mouth, tongue, etc.), as in "l", "r", and "w."

NASAL - "m" and "n"-like sounds.

DIPHTHONG - two or more vowels sounded together.

PLOSIVE - a stop.

FORMANT - the resonant frequencies of the vocal tract.

AFFRICATIVE - a phoneme beginning with a stop and ending with a
fricative, as "ch" in the word "church."

PITCH - a qualitative dimension of hearing related to the highness
or lowness of a tone and the sound intensity (with increased
intensity, a relative change in pitch can occur with a change
in frequency.)
TABLE VII [21]

STOPS                         FRICATIVES

/p/   pat      /b/   be       /h/   house    /v/   vest
/t/   to       /d/   day      /f/   fee      /ð/   then
/k/   key      /g/   go       /θ/   thin     /z/   zoo
                              /s/   see      /ʒ/   azure
                              /ʃ/   she

NASALS                        GLIDES

/m/   me                      /w/   we
/n/   need                    /j/   you
/ŋ/   song                    /r/   read
                              /l/   let

VOWELS (including diphthongs)

/i/   eve      /AE/  at       /ɔi/  boy      /U/   foot
/I/   it       /a/   ask      /ɒ/   not      /u/   boot
/e/   hate     /ʌ/   up       /ɔ/   all      /ai/  I
/ɛ/   met      /ei/  say      /o/   obey     /au/  out
               /ɑ/   father                  /ou/  go
APPENDIX B



The purpose of the appendix is to give a brief but complete
description of the mechanics involved in digitizing data and
interfacing from the SDS 9300 to the IBM 360/67.

A. ANALOG TO DIGITAL CONVERSION 

The analog signal is the input to the Comcor CI 5000. For all
signals, a bandpass or lowpass filter network is necessary to prevent
aliasing, a major problem. The upper cutoff frequency should be the
highest frequency of interest. The filtered signal is plugged into an
amplifier input on the CI 5000 and amplified by the highest possible
factor within the ±100 volt limits of the operational amplifiers.
This is required to yield the highest possible signal to noise ratio.
The output of the amplifier is the input to Trunkline 500 (T500) on the
analog patchboard. This links the analog to the digital computer.

The sampling frequency can be selected up to and including 2500
hertz for the present digital magnetic tape write scheme. If the analog
clock pulse is used, a selection of 10, 100, or 1000 hertz is available
on the logic patchboard and should be connected into the Master Clock
Input (MC IN). The output of the Master Clock (MC) is plugged into
Trunkline 210 (T210) on the logic patchboard.

If a variable sampling frequency is desirable or if a rate different
from those available on the logic board is required, an oscillator
can be plugged into a comparator input (say C03) on the analog
patchboard. The output of the comparator on the logic patchboard is plugged
into Master Clock In (MC IN). The output of the Master Clock (MC) is
connected to Trunkline 210 (T210).
The sampling program for the SDS 9300 is titled SAMPL, and was
developed by Lt. Lynn Dorrian for the SDS 930 [14]. It was modified
for the SDS 9300 by Lt. Jerry Post and Mr. R. Limes [15]. The program
will sample one analog signal. The data is stored in buffers and
written on magnetic tape from these buffers. It is designed to fill
one buffer while the other buffer is dumping onto the magnetic tape.
Since the write scheme for the magnetic tape is inherently slow, the
sampling rate is presently limited to a maximum of 2500 Hz. If only
two buffers of data are desired, this rate can be slightly exceeded.
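The alternating use of two buffers can be illustrated with a short sketch. It is written in Python for illustration only and is not the SAMPL routine: the real routine fills one buffer under interrupt control while the other is being written, whereas this sketch performs the two steps one after the other. The function name and the list standing in for the tape are hypothetical.

```python
def sample_run(samples, nrec, nsamp):
    """Simulate two-buffer sampling: fill one buffer, dump it, switch to the other.

    Returns (records_written, itog). itog is -1 at the end of the run,
    mirroring the end-of-run value described for ITOG.
    """
    buffers = [[0] * nsamp, [0] * nsamp]
    tape = []                 # stands in for the magnetic tape
    itog = 0                  # index of the buffer currently being filled
    source = iter(samples)
    for _ in range(nrec):
        filled = 0
        for value in source:
            buffers[itog][filled] = value
            filled += 1
            if filled == nsamp:
                break
        if filled < nsamp:    # input stopped early (analog out of "compute")
            break
        tape.append(list(buffers[itog]))   # dump the full buffer as one record
        itog = 1 - itog                    # the other buffer is filled next
    return tape, -1


if __name__ == "__main__":
    records, flag = sample_run(range(10), nrec=2, nsamp=4)
    print(records, flag)
```

Because a full buffer is handed off while the next one fills, the sustained sampling rate is bounded by how quickly one buffer can be written to tape, which is the limit described above.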

The procedure is to call the Subroutine SAMPL (NREC,NSAMP,IBUF0,
IBUF1,NTAPE,ITOG) where:

    NREC  = number of records/run.

    NSAMP = number of samples/record.

    IBUF0 = first data buffer.

    IBUF1 = second data buffer.

    NTAPE = magnetic tape unit number.

    ITOG  = 0 if IBUF0 is currently being filled.

          = 1 if IBUF1 is currently being filled.

After the program has been accepted by the SDS 9300 and the parameters
initialized, the digital computer waits for the analog computer to go
to "compute." The compute switch can be manually triggered or triggered
by a control signal.

Once the A/D routine commences, it continues until either the analog
goes out of "compute" or the number of records of data has been taken.
Ensure, after the proper number of records has been taken, that the
keyboard is put back into reset, if triggering manually. Control returns
to the digital program. The digital program waits for instructions.
If Sense Switch 6 is lighted, the program returns to initialization and
waits for the analog to again go to compute to digitize a new run of
data. If Sense Switch 5 is lighted, the magnetic tape is rewound to
load point. If Sense Switch 4 is lighted, the End of File condition is
entered on the tape. Sense Switch 3 will cause the line printer to
print the last 2 records of data in a print plot routine.

Enclosure (1) is a diagram of the CI 5000 analog and logic
patchboard configuration for the A/D conversion. Enclosure (2) is the
program listing for the Fortran IV routine utilized.

Below are listed a number of hints to save effort and time.

1) Use a magnetic tape provided by the IBM 360/67 computer facility
as it has a permanently marked load point with the correct spacing.
The tape unit number is 1 and should be set on "automatic."

2) Ensure that the digital program uses fresh cards, particularly the
RTM BOOT card; a feed check on the card reader is often sensed as a
program error.

3) Follow basic 9300 instructions. If the control console fails to
start program feed, turn off the system and start over. This "immediate
action battle drill" could save many hours.

4) Use a multichannel oscilloscope on the analog console to monitor the
pre-filtered signal, the post-filtered signal and the amplified signal.
This will allow a visual check to ensure proper analog action.

5) Monitor the hybrid interface console during digitizing to ensure
that the system is working.

6) Make at least the first two and last two runs from a known analog
signal (known frequency and amplitude). This is necessary to ensure the
conversion to the IBM 360 is complete and correct. An additional
suggestion is to insert a run of known data periodically throughout the
data of interest to also allow a check of the conversion.

7) Read NPGS Computer Facility Technical Note 0211-64 prior to
attempting A/D conversion.
B. CONVERSION TO THE IBM 360/67

The program used to convert from seven track, 24 bit data to nine track,
32 bit data is listed in Enclosure (3). The program is both Fortran IV
and assembly language. The control cards are listed below.

Card No.

1   //Name (standard job card) ...
2   //FOURIER EXEC FORTCALG,TIME.GO=05,REGION.GO=160K
3   //FORT.SYSIN DD *
        (main FORTRAN IV PROGRAM)
    /*
4   //ASM.SYSIN DD *
        (ASSEMBLY LANGUAGE PROGRAM)
5   //GO.FT02F001 DD UNIT=OCO,VOL=SER=NPSYYY,LABEL=(,BLP),               *
6   //             DISP=(OLD,KEEP),DCB=(DEN=0,RECFM=F,BLKSIZE=4096)
7   //GO.FT04F001 DD UNIT=2400,VOL=SER=NPSXXX,LABEL=(,SL),DISP=(NEW,KEEP), *
8   //             DCB=(DEN=2,RECFM=F,BLKSIZE=4096),DSNAME=

Cards 1, 2, and 3 are the standard control cards for the NPGS Computer
facility. Card 4 indicates the following program is in assembly
language. Cards 5 and 6 tell the computer where the seven track data is
located and its format. For example, the cards above say the 7-track
data is located on tape NPSYYY, to be placed on tape drive 02, the
label processing is to be bypassed, with fixed format, with blocksize
equal to 4096 bytes (1024 words). Cards 7 and 8 tell the computer
where the data is to be transferred. The data is to be written on
tape NPSXXX, tape drive 04, with standard label, fixed format in blocks
of 4096 bytes (1024 words). The 1024 words correspond to the number of
samples/record in the Subroutine SAMPL on the SDS 9300 when NSAMP=1024.

Anticipate difficulty in the conversion of the data. The tape 
drives are the weakest link and will require persistence to be success- 
ful. The first record of the data will probably be missed. The known 
data will allow a quick check of the system. 
[Enclosure (1): diagram of the CI 5000 patchboard configuration, with
two panels labeled ANALOG PATCHBOARD and LOGIC PATCHBOARD. Legible
labels: FILTERED SIGNAL TO OP AMP INPUT; OP AMP OUTPUT; OSCILLATOR
INPUT TO COMPARATOR; COMPARATOR OUTPUT; OUTPUT OF COMPARATOR PLUGGED
INTO MC INPUT; MASTER CLOCK OUT PLUGGED INTO T0210.]



Enclosure(2) to Appendix B 



C
C     THIS PROGRAM SAMPLES DIVERS SPEECH
C     BOND AND MYATT
C
C     * IN COLUMN 6 INDICATES CONTINUATION FROM PREVIOUS CARD
C     FOR THIS PROGRAM LIST ONLY
C
ΔFORTRAN GO,LS
      DIMENSION IBUF1(1024),IBUF2(1024),BUF(2048,2),JXY(2),Y
     *S(513)
      NAMELIST NOBLKS,LENGTH,ITAPE,ITOG
      ITAPE=1
      NOBLKS=2
      LENGTH=1024
      ITOG=0
      JXY(1)=1
      JXY(2)=2
    1 OUTPUT(101) NOBLKS,LENGTH,ITAPE,ITOG
      WRITE(102,104)
  104 FORMAT(* TYPE IN CHANGES.*/* FOLLOW WITH AN * AND RETU
     *RN.*)
      INPUT(101)
  101 FORMAT(2I2)
    2 IF(ITAPE) 3,4,3
    3 WRITE(102,102) ITAPE
  102 FORMAT(* CHECK ON TAPE*I1* AND CLEAR PAUSE.*)
      PAUSE 10
    4 CALL SAMPL(NOBLKS,LENGTH,IBUF1,IBUF2,ITAPE,ITOG)
C     WAIT WHILE THE FIRST BUFFER IS BEING FILLED
    5 IF(ITOG.EQ.0)GO TO 5
    6 CONTINUE
      NT=0
      IF(ITOG.EQ.-1)GO TO 10
C     WAIT WHILE THE SECOND BUFFER IS BEING FILLED
    7 IF(ITOG.EQ.1)GO TO 7
    8 CONTINUE
      NT=1
      IF(ITOG.EQ.-1)GO TO 10
      GO TO 5
   10 WRITE(102,103)
  103 FORMAT(//5X*END RUN...*/*SET DESIRED SENSE SWITCH.*)
   20 PAUSE 20
      IF(SENSE SWITCH 3) 11,12
   12 IF(SENSE SWITCH 4) 13,14
   14 IF(SENSE SWITCH 5) 15,16
   16 IF(SENSE SWITCH 6) 1,20
C     SENSE SWITCH 3 - PLOT THE LAST TWO RECORDS
   11 ML=(1-NT)*LENGTH
      NL=NT*LENGTH
      DO 18 I=1,LENGTH
      BUF(I,2)=I
      BUF((I+LENGTH),2)=I+LENGTH
      BUF(I+ML,1)=IBUF1(I)
   18 BUF((I+NL),1)=IBUF2(I)
      L4=LENGTH/2
      L=LENGTH*2
      CALL LONGPLOT(BUF,JXY,L,2048,1,0,0.0,0.0,0.0,0.0,L4,YS)
      GO TO 20
C     SENSE SWITCH 4 - WRITE END OF FILE
   13 ENDFILE ITAPE
      GO TO 20
C     SENSE SWITCH 5 - REWIND TAPE
   15 REWIND ITAPE
      GO TO 20
      END

ΔMETA9300 SI,LO,GO
$SAMPL   PZE
*
*        THIS PROGRAM SAMPLES ONE ANALOG SIGNAL ON TRUNKLINE 0500
*        AT A RATE DETERMINED BY A CLOCK PULSE APPEARING ON TRUNKLINE
*        0210. ANALOG TO DIGITAL CONVERSION IS PERFORMED AND THE
*        DATA IS STORED IN BUFFERS SET UP IN THE CALLING PROGRAM.
*        CALLING SEQUENCE IS...
*
*           CALL SAMPL(NREC,NSAMP,IBUF0,IBUF1,NTAPE,ITOG)
*
*        WHERE..
*           NREC=NUMBER OF RECORDS/RUN
*           NSAMP=NUMBER OF SAMPLES/RECORD
*           IBUF0=FIRST DATA BUFFER
*           IBUF1=SECOND DATA BUFFER
*           BUFFERS ARE FILLED WITH NSAMP A/D SAMPLES ALTERNATELY
*           NTAPE=MAG TAPE UNIT NUMBER
*              IF =0, NO TAPE IS WRITTEN
*           ITOG=0 IF IBUF0 CURRENTLY BEING FILLED
*               =1 IF IBUF1 CURRENTLY BEING FILLED
*
*        AFTER INITIALIZING, THE SUBROUTINE WAITS FOR THE ANALOG TO
*        GO TO COMPUTE. END OF RUN IS CAUSED BY ANALOG GOING OUT
*        OF COMPUTE OR WHEN NREC RECORDS OF DATA HAVE BEEN TAKEN.
*        IN EITHER CASE, ITOG IS SET TO -1 TO NOTIFY THE CALLING
*        ROUTINE OF THE ENDRUN CONDITION.
*        NOTE THAT AFTER INITIALIZING THE A/D SAMPLING, CONTROL IS
*        RETURNED TO THE CALLING ROUTINE AND INTERRUPTS CALL THE
*        A/D INPUT ROUTINE.
*
ITG      OPD    0100000
         BRM    9SETUPN
         PZE    6
NREC     ITG                  NUMBER OF RECORDS/RUN
NSAMP    ITG                  NUMBER OF SAMPLES/RECORD
BUFF0    ITG                  FIRST DATA BUFFER
BUFF1    ITG                  SECOND DATA BUFFER
TAPE     ITG                  MAG TAPE UNIT NUMBER
TOGGL    ITG                  TOGGLE...INDICATES BUFFER BEING FILLED
*                             OR ENDRUN CONDITION.


A 


EQU 


B 




M4 


EOU 


-5 




B 


EOU 


4 




XI 


EOU 


1 




X2 


EOU 


2 




X3 


EOU 


3 






EOM 


033001 


SETLINFS I AND 2 FALSE 




POT 


=0 






LDA 


♦TOGGL 






STA 


TIME 






LDA 


=-l 






S T A 


NOTAPE 






LDA 


♦ TAPE 






Ftr 


=0T 






SKU 


=0 






sta 


NCTAPE 






COPY 


( A ,R ) 






MR G 


WTB X 






STA 


WTRN 






COPY 


IB, A) 






MRG 


EFTX 






STA 


EFTN 






COPY 


<B,A) 






MRG 


TRTX 






STA 


trtn 






COPY 


( B « A ) 






MRG 


BTTX 






STA 


B T T N 






LDA 


♦NSAMP 






COPY 


( A ,B ) 






SUR 


= 1 






         MRG    PLACE
GMN
         STA    SDATAT
         STA    CURX2
         COPY   (0,A)
         LLSB   14
         ETR    DAT
         MRG    ALC
         STA    ALC0
         STA    ALC1
         COPY   (0,A)
         LDB    BUFF0
         LLSB   0
         LLSB   1
         LRSB   10
         LLSA   0
         MRG    ALC0
         STA    ALC0
         COPY   (B,A)
         MRG    DAT
         STA    CW0
         COPY   (0,A)
         LDB    BUFF1
         LLSB   0
         LLSB   1
         LRSB   10
         LLSA   *5
         MRG    ALC1
         STA    ALC1
         COPY   (B,A)
         MRG    DAT
         STA    CW1
         LDA    BUFF0
         ETR    =077777
         ADD    =-1
         STA    ORIG0
         STA    COMM
         LDA    BUFF1
         ETR    =077777
         ADD    =-1
         STA    ORIG1
         LDA    DCW
         MRG    =COMM
         STA    SVCW
         STA    CW
         LDA    BRMAD
         XMA    040
         STA    SV040
         LDA    BRMSAM
         XMA    051
         STA    SV051
         SKN    NOTAPE
         BRU    AGAIN
         LDA    BRMCL
         XMA    010
         STA    SV010
         LDA    BRMFL
         XMA    011
         STA    SV011
         EXU    TRTN
         CAT    0
         BRU    *-2
         EXU    BTTN
         BRU    *+2
         BRU    AGAIN
         EOM    014000
         POT    PLACE
         EXU    EFTN
         STZ    *TOGGL
         STZ    TOGGT
         LDA    =-1
         STA    ADFL
         LDA    *NREC
         ADD    =-1



STA 

PPM 

PQT 

SKS 

RP U 
PPM 
DOT 
FIR 
EOM 

FND OF P %l 
SURRPU T T wp 



COUNT 

C33C01 

=040000000 

030010 ANALOG 

*-l 

C33C01 
= 0 



SETLINE 1 TRUE
IN COMPUTE TEST 



* 

* 

* 

c T4R t 



L0AP1 



R TC c T 1 



PTFST2 



PZE 
STA 
STX 
SKN 
BRU 
DIR 
LOA 
SKG 
BP U 
MPQ 
SKS 
BPM 

ST7 
EPM 
PPT 
FTP 
LDX 
BP X 
SKN 
RP U 
LOA 
STA 
LOA 
STA 
ST7 

Lnx 

SKN 

RP U 

EXU 

CAT 

BP U 

LDA 

ADM 

ADD 

SKL 

COPY 

STA 

EXU 

EXU 

POT 

BRU 



033004 

samrl 

TIAL SET UP 
START ENTERED 



SVA 

S VX2 » X2 

ADEL 

t-1 



FALSE 



ENABLE 

RETURN 



PATCHBOARD INTERRUPTS 
TO MAIN PROGRAM 



ON INTERRUPT 051 



♦TOGGL 

TOGGT 

LOADO 

COMM 

030010 

ENDPUN 

ADFL 

C34COO 

CW 



CURX2 ♦ X2 

RTDLE » X2 

5DFL 

T-l 

ORIGO 

COMM 

svcw 

c w 

♦TOGGL 
SDA TAT , X2 
NOTAPE 
INCR 
TRTN 
0 

t-2 
TIME 
♦BUFFI 
= 1 

= 1000 
(0,5) 

TIME 
WTBN 
A LC 1 
CWl 
INCR 



THIS PROCEDURE TESTS 
TO DETERMINE WHICH 
BUFFER TO LOAD 



ANALOG 

NO 



IN COMPUTE 



THIS DIVIDES THE SUBROUTINE INTO BUFFERS 



LOADO 



RTEST3 



MPO 


COMM 


SKS 


030010 


BRM 


ENDPUN 


ST7 


ADFL 


EOM 


C34000 


POT 


CW 


FIR 




LDX 


CURX2 ,X2 


BR X 


RIDLE ,X2 


SKN 


ADFL 


BRU 


*-l 


LDA 


0RIG1 


STA 


COMM 


LDA 


svcw 



ANALOG 

NO 



IN COMPUTE
STA


c w 




mpp 


♦TQGGL 




inx 


SO A TA T , 


P tc<;T4 


CKN 


NO T A PF 




RPM 


I NCR 




?XIJ 


TP T N 




r r t 


0 




R° M 


«-2 




LOA 


▼T MF 




A n m 


♦ BUFFO 




Ann 


= 1 




C KL 


= 1000 




rnpy 


( 0 ♦ S ) 




ST * 


rf MC 




*Xl) 


WTBN 




c XU 


ALCO 




onr 


CWO 


t «jr c 


ckp 


COUNT 




BPU 


R IDLE 






RPM 


FIN 


C T^l - 


L n a 


c Vf. 




>TX 


C IJR X2 * X; 




LPX 


S VX2 ♦ X2 




RPC 


♦START 


P IM 


CQM 


033000 




J 0 
COM 


033001 




DOT 


=020000 




fKC 


C31001 




RP 1! 


t-1 




POM 


C33001 




6 n t 


= 0 




^ K P 


♦ TOGG L 




n Pll 


t-1 




fit T 


C 




«PII 


t-i 




LP * 


SVO^O 




<:ta 


C40 




LPA 


s V051 




*T* 


051 




l.HA 


*V010 




CT/S 


010 




1.0* 


SVOll 




CT* 


Oil 




PIP 
RP 1! 


R TOLF 


^nop dm 


07C 






OTP 

LPA 


r pmv 




C TO 


=C 7 7777 




C T 1 


CCMV 




i.n a 


= -l 




inx 


riJRX2,X 




CTA 


♦CO^M 




uoq 


c C MW 




RP X 


t-2 ,X? 




PT7 


CCI.J NT 




I PA 


F NO RUM 




cxr, 


= L 0 A D 0 




RPM 


R T c s T2 


A 


RP 1) 


R TP SJA 


* SMQ OF 


s L'P C CU T I NF 


P D MP L 


RO M 


PLUG 


R|. IJD 


C7 C 






per 


♦ OHIO 


svoir 


P 7 r 




« V ■* 1 I 


D7P 




* 


^ ~\I 


FOO V 


V * 

10,14 


r ">\] T 


FOP M 



BLOCK COUNT IS REDUCED.

IF NEGATIVE, ALL DATA HAS BEEN
TRANSFERRED.

SETLINE 2 TRUE

WAIT FOR ANALOG
ON TEST 1.
SETLINE 2 FALSE
SPACE


CONT 


150.0 

CIOOOOO 


new 


DATA 


SVCW     PZE
CW       PZE
COMM     PZE
         PZE
SDATAT   RES    1
ORIG0    RES    1
ORIG1    RES    1
DAT      RES    1
ALC      EOM    014000
ALC0     PZE
ALC1     PZE
CW0      RES    1
CW1      RES    1
TOGGT    RES    1
COUNT    RES    1
SVA      PZE
SVX2     PZE
CURX2    PZE
SV040    RES    1
SV051    RES    1
PLACE    DATA   077700000
TIME     PZE
NOTAPE   PZE
BTTX     BTT    0,0
BTTN     PZE
WTBX     WTB    *0,0,4
WTBN     PZE
EFTX     EFT    0,0,4
EFTN     PZE
TRTX     TRT    0,0
TRTN     PZE
BRMSAM   BRM    START
BRMAD    BRM    ADEND
ADEND    PZE
         SKR    ADFL
         BRU    *-1
         BRC    *ADEND
ADFL     PZE
         END
ΔEOF




ΔLOAD    XM,MAP.
A        DATA
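
The alternating-buffer convention described in the SAMPL header comments (ITOG = 0 or 1 while a buffer is being filled, ITOG = -1 at end of run) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the interrupt-driven SDS 9300 routine: the function name, the iterable standing in for the A/D converter, and the treatment of an exhausted input as "analog out of compute" are all hypothetical.

```python
def run_sampler(source, nrec, nsamp):
    """Simulate filling two buffers alternately with nsamp samples each.

    Returns the completed records as (buffer_index, samples) pairs and the
    final toggle value, which is -1 to signal the end-of-run condition.
    """
    buffers = [[0] * nsamp, [0] * nsamp]
    itog = 0                     # 0: first buffer being filled, 1: second
    records = []
    stream = iter(source)
    for _ in range(nrec):
        # range() is listed first so zip never consumes an extra sample.
        chunk = [x for _, x in zip(range(nsamp), stream)]
        if len(chunk) < nsamp:   # input ended early (assumed end-run cause)
            break
        buffers[itog][:] = chunk
        records.append((itog, list(buffers[itog])))
        itog = 1 - itog          # switch to the other buffer
    return records, -1           # -1 notifies the caller of end of run
```

The caller can process the buffer that was just completed while the other one is being filled, which is the purpose of the toggle.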



Enclosure(3) To Appendix B 

THIS PROGRAM INTERFACES THE SDS 9300 TO THE IBM 360

//MYA0053  JOB (0053,47FP),'MYATT',MSGLEVEL=1,CLASS=G
//FOURTFF  EXEC FORTCALGP,TIME.GO=10,REGION.GO=160K
//FORT.SYSIN DD *
      DIMENSION INDATA(1024),DATA(1024)
      FACTOR=100.0/(2**31-1)
      REWIND 2
C
C     DO J=1,K WHERE K IS THE NUMBER OF BLOCKS OF 1024 EACH
      DO 31 J=1,133
      READ(2,3,END=40,ERR=50)INDATA
    3 FORMAT(16(64A4))
      CALL FORM(INDATA)
      WRITE(6,70)J
   70 FORMAT('1',10X,'RECORD NO.=',I6)
      DO 1 I=1,1024
    1 DATA(I)=INDATA(I)*FACTOR
      WRITE(6,66)(DATA(I),I=1,1024)
   66 FORMAT(1X,10E10.2)
      WRITE(4,3)DATA
      GO TO 31
   50 WRITE(6,51)J
   51 FORMAT('0',5X,'READ ERROR, RECORD NO.=',I6)
   31 CONTINUE
      STOP
   40 WRITE(6,41)J
   41 FORMAT('0',5X,'END OF TAPE, RECORD NO.=',I6)
      STOP
      END
//ASM.SYSIN DD *

FORM     START 0
*        SUBROUTINE FORM(INDATA)
*
*        THIS SUBROUTINE WILL CONVERT 24 BIT BINARY WORDS STORED IN
*        INDATA, AN ARRAY OF LENGTH SPECIFIED BY THE INDEX VALUE,
*        TO 32 BIT BINARY WORDS AND PLACE THESE SAME WORDS BACK INTO
*        INDATA
*
         STM   14,12,12(13)
         BALR  6,0
         USING *,6
         USING DATA,7
         SR    7,7
         L     11,=F'1024'         THIS IS THE INDEX VALUE
         L     12,0(1)
*        THIS SUBROUTINE CONVERTS 24 BIT BINARY WORDS TO 32 BIT WORDS
LOOP     L     2,NUM(12)
         LR    3,7
         SRDL  2,6
         SRL   2,2
         SRDL  2,6
         SRL   2,2
         SRDL  2,6
         SRL   2,2
         SRDL  2,6
         ST    3,NUM(12)
         LA    12,4(12)
         BCT   11,LOOP
         LM    2,12,28(13)
         MVI   12(13),X'FF'
         BCR   15,14
DATA     DSECT
NUM      DS    1F
         END
//GO.FT02F001 DD UNIT=0C0,V
//             DISP=(OLD,KEEP),DCB=(DEN=0,RECFM=F,BLKSIZE=4096)
//GO.FT04F001 DD UNIT=2400,VOL=SER=NPS234,LABEL=(,SL),DISP=(NEW,KEEP),
//             DCB=(DEN=2,RECFM=F,BLKSIZE=4096),DSNAME=MYA0053
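
The word conversion performed by FORM, followed by the scaling applied in the calling program (FACTOR=100.0/(2**31-1)), can be sketched in Python. The sketch reflects one reading of the shift sequence in the listing and rests on an assumption not stated there: that each 32 bit word read from tape holds four bytes, each carrying six data bits in its low-order positions. The function names are hypothetical.

```python
FACTOR = 100.0 / (2**31 - 1)   # scale factor used by the calling program

def unpack_word(word):
    """Pack the six data bits of each of four bytes into a 24 bit value
    left-justified in a signed 32 bit word (assumed byte layout)."""
    value24 = 0
    for shift in (24, 16, 8, 0):             # most significant byte first
        value24 = (value24 << 6) | ((word >> shift) & 0x3F)
    result = (value24 << 8) & 0xFFFFFFFF     # left-justify in 32 bits
    if result & 0x80000000:                  # interpret as two's complement
        result -= 1 << 32
    return result

def scale(word):
    """Convert one raw tape word to a scaled floating point sample."""
    return unpack_word(word) * FACTOR
```

Left-justifying the 24 bit value places its sign bit in the sign position of the 32 bit word, so a full-scale sample maps to approximately plus or minus 100 after scaling.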
COMPUTER PROGRAM HARM

PROGRAM COMPUTES POWER SPECTRUM FOR 8192 DATA POINTS

$ IN COLUMN 6 INDICATES CONTINUATION FROM PREVIOUS CARD
FOR THIS PROGRAM LIST ONLY

//MYA20053 JOB (0053,47FP),'MYATT',MSGLEVEL=1,CLASS=D
//FOURR    EXEC FORTCLGP,PARM.FORT='LINECNT=75',TIME.GO=02,
//             REGION.GO=300K
//FORT.SYSIN DD *
      REAL*4 LABEL(10)/40HONE TWO THREFOURFIVESIX SEVNEIGTNIN
     $ETEN /
      REAL*8 ITITLE(12)/96H POWER SPECTRUM J.M. MYATT
     1 M**2 111-118 W MUZZLE SINGING
      DIMENSION XG(420),D(1260),Y(8200),B(1260)
      DIMENSION S(8200),INV(8200),M(3),DATA(1024)
      COMPLEX*8 A(8192,1,1),Z(1260,1,1)
      CALL CANCEL(2)
      ITYPE=2
      KK=13
      TIME=ITIME(0)*0.01
      N1=2**KK
      NPT=N1
      NY=1
      NN=420
      NZT=420
C     PROGRAM COMPUTES 8192 COEF.
C     PLOT GOES TO 3000 HERTZ
      IMAX=1
      BMAX=0.0
      KA=2
      KC=2
      KO=2
C     SAMPLING RATE IS 2048 CPS
      XN=N1
      DT=1.0/2048.0
      BW=1.0/(2.0*DT)
      DELTF=1.0/(XN*DT)
      N2=1
      N3=1
      M(1)=KK
      M(2)=0
      M(3)=0
      KB=0
C     NN= RUNNING COUNT OF POINTS BEING PLOTTED
C     NZT= NUMBER OF POINTS PER GRAPH
C     DELTF = FREQUENCY RESOLUTION
C     J IS # OF BLOCK OF DATA ON MAG TAPE BEING READ
      DO 700 J=1,118
      IO=KB*1024
      READ(4,50)(DATA(I),I=1,1024)
   50 FORMAT(16(64A4))
      IF(J.LT.111)GO TO 700
      DO 777 I=1,1024
  777 Y(IO+I)=DATA(I)
      KB=KB+1
  700 CONTINUE
      DO 2 I1=1,N1
      DO 2 I2=1,N2
      DO 2 I3=1,N3
    2 A(I1,I2,I3)=Y(I1)
      WRITE(6,40)DT,J,N1,NZT
   40 FORMAT(////3X,'SAMPLING INTERVAL= ',F10.6,10X,' SOUND
     $/AE/ '
     $,10X,' BLOCK = ',I6,5X,'DATA POINTS ',2I7/)
      WRITE(6,66)BW,DELTF
   66 FORMAT(//6X,'BANDWIDTH = ',F15.6,10X,' DELTA F = ',F10.
     $6)
      WRITE(6,10)(((I1,A(I1,I2,I3),I3=1,N3),I2=1,N2),I
     $1=1,N1)
   10 FORMAT(//' AMPLITUDE OF FOURIER FUNCTION TO BE TRANSFO
     $RMED'/(4(1X,
     $'A(',I5,')=',1X,2F7.3,4X)))
      CLOCK=ITIME(0)*0.01
      CALL HARM(A,M,INV,S,-1,IFERR)
      DO 34 I1=1,NZT
   34 XG(I1)=I1-1
C     KO IS TEST FOR SMOOTHING
      IF(KO.LT.1)GO TO 800
      Z(1,1,1)=A(1,1,1)
      B(1)=CABS(Z(1,1,1))
      DO 801 I1=2,1260
      I2=I1-1
      I3=I1+1
      Z(I1,1,1)=-0.25*(A(I2,1,1)+A(I3,1,1))+0.5*A(I1,1,1)
      B(I1)=CABS(Z(I1,1,1))
      B(I1)=B(I1)**2
  801 CONTINUE
      WRITE(6,965)
  965 FORMAT(1H0,' FIRST SMOOTHING FUNCTION USED ')
      GO TO 820
  800 DO 33 I1=1,1260
      B(I1)=CABS(A(I1,1,1))
   33 B(I1)=B(I1)**2
  820 DO 11 I1=1,420
   11 XG(I1)=XG(I1)*DELTF
      J1=0
C     KC IS SECOND SMOOTHING FUNCTION CHECK
      IF(KC.GT.1)GO TO 987
  990 DO 666 I1=1,1260
      IF(B(I1).LT.BMAX)GO TO 666
      BMAX=B(I1)
      IMAX=I1
  666 CONTINUE
      WRITE(6,667)BMAX,IMAX
  667 FORMAT(1H0,' BMAX = ',F7.3,' IMAX = ',I6)
      DO 78 J3=1,3
      J2=1
C     KA IS CHECK FOR LOG10 PLOT
      IF(KA.LT.1)GO TO 840
      DO 850 I1=NY,NN
      B(I1)=B(I1)/BMAX
      IF(B(I1).LE.0.002)GO TO 1
      D(J2)=10.0*ALOG10(B(I1))
      IF(D(J2).GE.-27.0)GO TO 850
    1 D(J2)=-27.0
  850 J2=J2+1
  851 CALL DRAW(NZT,D,XG,0,0,LABEL(J3),ITITLE,3.,7.0,0,0,0,0
     $,9,15,1,LAST)
      GO TO 855
  840 DO 77 I1=NY,NN
      D(J2)=B(I1)
      J2=J2+1
   77 CONTINUE
      CALL DRAW(NZT,D,XG,0,0,LABEL(J3),ITITLE,0,7.0,0,0,0,0,
     $9,15,1,LAST)
C     NOTE $$$$$$$ X AND Y AXIS ARE REVERSED
  855 WRITE(6,20)(((I1,A(I1,I2,I3),I3=1,N3),I2=1,N2),I
     $1=NY,NN)
   20 FORMAT(//' FOURIER TRANSFORM A(I1,I2,I3)'/(4(1X,'A(',I5,
     $')=',1X
     $,2F7.3,4X)))
      NY=NN+1
      NN=NN+420
   78 CONTINUE
      ITYPE=2
      GO TO 35
  987 D(1)=B(1)
      DO 988 I1=2,1259
      I2=I1-1
      I3=I1+1
      D(I1)=0.5*B(I1)+0.25*(B(I2)+B(I3))
  988 CONTINUE
      DO 989 I1=1,1259
      B(I1)=D(I1)
  989 CONTINUE
      WRITE(6,966)
  966 FORMAT(1H0,' SECOND SMOOTHING FUNCTION USED')
      GO TO 990
   35 CONTINUE
      END
//GO.FT04F001 DD DSNAME=MYA0053,UNIT=2400,DCB=(RECFM=F,BLKSIZE=4096),
//             VOLUME=(PRIVATE,SER=NPS234)
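
The numerical steps of the program above can be followed in a small Python sketch. It is an illustration under stated assumptions, not the program itself: a direct discrete Fourier transform stands in for the HARM library routine, the function names are hypothetical, and a 16 point cosine replaces the 8192 point speech record. With DT = 1/2048 second and 8192 points, the listing's formulas give a bandwidth of 1024 Hz and a resolution DELTF of 0.25 Hz.

```python
import cmath
import math

def resolution(n_points, dt):
    """Return (bandwidth, delta_f) as computed for BW and DELTF."""
    return 1.0 / (2.0 * dt), 1.0 / (n_points * dt)

def dft(samples):
    """Direct DFT, used here only as a stand-in for HARM."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * math.pi * j * k / n)
                for k in range(n)) for j in range(n)]

def first_smoothing(a):
    """Z(I) = -0.25*(A(I-1)+A(I+1)) + 0.5*A(I); first point unchanged."""
    z = [a[0]]
    for i in range(1, len(a) - 1):
        z.append(-0.25 * (a[i - 1] + a[i + 1]) + 0.5 * a[i])
    return z

def second_smoothing(b):
    """D(I) = 0.5*B(I) + 0.25*(B(I-1)+B(I+1)) applied to power values."""
    d = [b[0]]
    for i in range(1, len(b) - 1):
        d.append(0.5 * b[i] + 0.25 * (b[i - 1] + b[i + 1]))
    return d

def normalized_db(power, floor_db=-27.0, cutoff=0.002):
    """Normalize to the peak and convert to dB, clamped at floor_db."""
    peak = max(power)
    out = []
    for p in power:
        ratio = p / peak
        if ratio <= cutoff:
            out.append(floor_db)
        else:
            out.append(max(10.0 * math.log10(ratio), floor_db))
    return out

# Example: a cosine at bin 3 of a 16 point record.
x = [math.cos(2.0 * math.pi * 3 * k / 16) for k in range(16)]
power = [abs(v) ** 2 for v in first_smoothing(dft(x))]
db = normalized_db(power)
```

In the example the peak falls at bin 3 (0 dB), the adjacent bins are about 6 dB down as a result of the first smoothing, and bins carrying no signal are held at the -27 dB floor used for plotting.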






BIBLIOGRAPHY

1.  Communication Sciences Laboratory/Office of Naval Research
    Report 20, Abstracts of Four Studies on Underwater
    Communications, by H. Hollien, J. Brandt and J. Malone,
    1967-68.

2.  Special Issue on Fast Fourier Transform, Institute of Electrical
    and Electronics Engineers Transactions on Audio and
    Electroacoustics, June 1967.

3.  Hydro-Products Division of Dillingham Company, Facts on
    Underwater Communications, 1969.

4.  CSL/ONR Report 15, The Effect of Air Bubbles in the External
    Auditory Meatus on Underwater Hearing Thresholds, by H. Hollien,
    T.E. Doherty and J. Brandt, March 1967.

5.  CSL/ONR Report 11, Intelligibility of Diver Communication
    Systems, by H. Hollien, J. Brandt and T.E. Doherty, 15
    December 1968.

6.  Fant, C.G.M., "Speech at High Ambient Pressures," Acoustic
    Specification of Speech, Royal Institute of Technology,
    Speech Transmission Laboratory, Sweden.

7.  Mullen, W.W. Jr., "Literature Survey Pertinent to Underwater
    Voice Communication," Technical Notes (Unpublished), U.S.
    Navy Mine Defense Laboratory, Panama City, Florida.

8.  Brubaker, R.S. and Wurst, J.W., "Spectrographic Analysis of
    Divers' Speech During Decompression," Journal of Acoustical
    Society of America, v. 43, no. 4, 1968.

9.  Potter, R.K. and Kopp, G.A., Visible Speech, Dover Publications,
    Inc., New York, 1966.

10. Comer, D.J., "The Use of Waveform Asymmetry To Identify Voiced
    Sounds," IEEE Transactions on Audio and Electroacoustics,
    v. AU-16, no. 4, December 1968.

11. Lindgren, Nilo, "Speech-Man's Natural Communication," IEEE
    Spectrum, v. 4, no. 6, June 1967.

12. Oppenheim, A.V., "Speech Analysis-Synthesis System Based on
    Homomorphic Filtering," JASA, v. 45, p. 458-465, February 1969.

13. Priestly, M.B., "Power Spectral Analysis of Non-Stationary
    Random Processes," Journal of Sound and Vibrations, v. 6,
    p. 86-97, July 1967.

14. Dorrian, L.V., "Digital Spectral Analysis," Masters Thesis,
    Naval Postgraduate School, Monterey, California, September 1968.

15. Post, J.L., "Analysis and Synthesis of a Time Limited Complex
    Wave Form," Masters Thesis, Naval Postgraduate School,
    Monterey, California, December 1968.

16. Bingham, C., Godfrey, M.D., and Tukey, J.W., "Modern Techniques
    of Power Spectrum Estimation," IEEE Transactions on Audio
    and Electroacoustics, v. AU-15, no. 2, June 1967.

17. Singleton, R.C. and Poulter, T.C., "Spectral Analysis of the
    Call of the Male Killer Whale," IEEE Transactions on Audio
    and Electroacoustics, v. AU-15, no. 2, June 1967.

18. Bremerman, H.J., "Pattern Recognition Functional and Entropy,"
    Contract NONR 3656 (08) and NONR 222 (88), December 10, 1967.

19. Gold, B. and Rabiner, L.R., "Analysis of Digital and Analog
    Formant Synthesizer," IEEE Transactions on Audio and
    Electroacoustics, v. AU-16, p. 81-94, March 1968.

20. Fischer-Jorgenson, E., "What Can the New Techniques of Acoustic
    Phonetics Contribute to Linguistics," Psycholinguistics,
    Holt, Rinehart and Winston, New York, 1961.

21. Morrow, C.T., "Reaction of Small Enclosures on the Human Voice,
    Part II," JASA, v. 20, p. 487-97, July 1948.

22. Drucher, H., "Speech Processing in High Ambient Noise
    Environment," IEEE Transactions on Audio and Electroacoustics,
    v. AU-16, no. 2, p. 166, June 1968.

23. Hunter, E.K., "Problems of Diver Communication," IEEE
    Transactions on Audio and Electroacoustics, v. AU-16,
    no. 1, p. 118-21, March 1968.

24. Kenny, J.E., "Some Design Problems in Wireless Diver
    Communications," IEEE Transactions on Audio and Electroacoustics,
    v. AU-14, no. 4, p. 174-177, December 1966.

25. Morrow, C.T., "Reaction of Small Enclosures on the Human Voice,
    Part I," JASA, v. 19, p. 645-652, July 1947.

26. Olson, H.F., and others, "Speech Processing Techniques and
    Applications," IEEE Transactions on Audio and Electroacoustics,
    v. AU-15, no. 3, September 1967.

27. Parzen, E., "Informal Comments on Uses of Power Spectrum
    Analysis," IEEE Transactions on Audio and Electroacoustics,
    v. AU-15, no. 2, p. 75-76, June 1967.

28. Pickett, J.M., "Low Frequency Noise and Methods for Calculating
    Speech Intelligibility," JASA, v. 31, September 1959.



INITIAL DISTRIBUTION LIST

                                                          No. Copies

1.  Defense Documentation Center                               20
    Cameron Station
    Alexandria, Virginia 22314

2.  Library, Code 0212                                          2
    Naval Postgraduate School
    Monterey, California 93940

3.  Professor Harold A. Titus, Code 52TS                        1
    Department of Electrical Engineering
    Naval Postgraduate School
    Monterey, California 93940

4.  Maj. William H. Bond, USMC                                  1
    Box 116
    Nocona, Texas 76255

5.  Capt. J.M. Myatt, USMC                                      1
    2027 Ferndale Avenue
    Dallas, Texas 75224

6.  Commandant of the Marine Corps (Code A03C)                  1
    Headquarters, U. S. Marine Corps
    Washington, D. C. 20380

7.  James Carson Breckinridge Library                           1
    Marine Corps Development and Educational Command
    Quantico, Virginia 22134

8.  Dr. Harry Hollien                                           1
    Communications Sciences Laboratory
    Department of Speech
    University of Florida
    Gainesville, Florida 32601

9.  Mr. James Lasch                                             1
    Hydro Products
    Division of Dillingham Corporation
    Box 2528
    San Diego, California 92112

10. Mr. Omar Lamborn                                            1
    Electro-Dynamics Division
    Bendix Corporation
    11600 Sherman Way
    North Hollywood, California 92112

11. LCDR Al Spinks                                              1
    Naval Special Warfare Group
    Naval Amphibious Base
    Coronado, California

12. Mr. Monns Turntine                                          1
    Naval Applied Science Laboratory
    Flushing and Washington Avenues
    Brooklyn, New York 11251



Security Classification

DOCUMENT CONTROL DATA - R & D

1.  ORIGINATING ACTIVITY (Corporate author)
    Naval Postgraduate School
    Monterey, California 93940

2a. REPORT SECURITY CLASSIFICATION
    Unclassified

2b. GROUP

3.  REPORT TITLE
    Investigation of Distortion of Divers' Speech Using Power Spectral
    Estimates Based on the Fast Fourier Transform

4.  DESCRIPTIVE NOTES (Type of report and inclusive dates)
    Master's Thesis; June 1969

5.  AUTHOR(S) (First name, middle initial, last name)
    William H. Bond
    James M. Myatt

6.  REPORT DATE
    June 1969

7a. TOTAL NO. OF PAGES
    96

7b. NO. OF REFS
    28

8b. PROJECT NO.

9a. ORIGINATOR'S REPORT NUMBER(S)

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned
    this report)

10. DISTRIBUTION STATEMENT
    This document has been approved for public release and sale; its
    distribution is unlimited.

11. SUPPLEMENTARY NOTES

12. SPONSORING MILITARY ACTIVITY
    Naval Postgraduate School
    Monterey, California 93940

13. ABSTRACT

    The problem of distortion in underwater communications peculiar to free
    divers and techniques for analysis of speech wave forms are discussed. The
    Fast Fourier Transform algorithm, selected to analyze shifts in formant
    frequencies due to restricted oral cavities, high ambient pressures, and
    forced speech is discussed. The Fast Fourier Transform is used to analyze
    a vowel sound and show that the expected shifts do occur. Recommendations
    are made for extending the techniques to all non-noise like sounds and
    breathing mixtures other than compressed air.

14. KEY WORDS

    Underwater communication
    Divers' speech
    Fast Fourier Transform
    Power spectral estimates
    Speech processing











